
Nearly Optimal Bounds for Cyclic Forgetting

Neural Information Processing Systems

One challenge of continual learning is "catastrophic forgetting" [Had+20; VT19; Kem+18]: a model trained on a new task tends to overwrite what it learned on an earlier task A. However, if contexts similar to A arise repeatedly, this may be undesirable. In machine learning, many data sets display cyclic or periodic patterns.


An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems

Hao, Yuren, Wan, Xiang, Zhai, ChengXiang

arXiv.org Artificial Intelligence

In this paper, we introduce a systematic framework that goes beyond conventional methods to assess LLMs' mathematical-reasoning robustness by stress-testing them on advanced math problems that are mathematically equivalent but with linguistic and parametric variation. These transformations allow us to measure the sensitivity of LLMs to non-mathematical perturbations, thereby enabling a more accurate evaluation of their mathematical reasoning capabilities. Using this new evaluation methodology, we created PutnamGAP, a new benchmark dataset with multiple mathematically-equivalent variations of competition-level math problems. With the new dataset, we evaluate multiple families of representative LLMs and examine their robustness. Across 18 commercial and open-source models we observe sharp performance degradation on the variants. OpenAI's flagship reasoning model, O3, scores 51.5% on the originals but drops by 4.7 percentage points on surface-renaming variants, and by 12.9 percentage points on parametric variants, while smaller models fare far worse. Overall, the results show that the proposed new evaluation methodology is effective for deepening our understanding of the robustness of LLMs and generating new insights for further improving their mathematical reasoning capabilities.
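The two transformation types the abstract names, surface renaming and parametric substitution, can be sketched as simple string rewrites. The helper names below are illustrative assumptions, not the paper's actual generation pipeline, which would also need to verify that each variant stays mathematically equivalent:

```python
import re

def surface_rename(problem: str, mapping: dict[str, str]) -> str:
    """Rename variables without changing the mathematics.

    `mapping` sends each original symbol to a fresh one, e.g. {"x": "t"}.
    Whole-word matches only, so "max" is not touched by "x" -> "t".
    Assumes target names do not collide with later source names.
    """
    for old, new in mapping.items():
        problem = re.sub(rf"\b{re.escape(old)}\b", new, problem)
    return problem

def parametric_variant(problem: str, old: int, new: int) -> str:
    """Swap one numeric parameter for another; the caller is responsible
    for checking that the resulting problem is equivalent in difficulty."""
    return re.sub(rf"\b{old}\b", str(new), problem)

original = ("Prove that for every integer n > 2, the equation "
            "x^n + y^n = z^n has no positive integer solutions.")
renamed = surface_rename(original, {"x": "a", "y": "b", "z": "c", "n": "k"})
variant = parametric_variant(original, 2, 3)
```

A real pipeline would apply such rewrites to a parsed representation of the problem rather than raw text, but the string version conveys the idea of perturbing everything except the mathematics.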




Collective decision-making with higher-order interactions on $d$-uniform hypergraphs

Njougouo, Thierry, Carletti, Timoteo, Tuci, Elio

arXiv.org Artificial Intelligence

Understanding how group interactions influence opinion dynamics is fundamental to the study of collective behavior. In this work, we propose and study a model of opinion dynamics on $d$-uniform hypergraphs, where individuals interact through group-based (higher-order) structures rather than simple pairwise connections. Each of the two opinions $A$ and $B$ is characterized by a quality, $Q_A$ and $Q_B$, and agents update their opinions according to a general mechanism that takes into account the weighted fraction of agents supporting either opinion and the pooling error, $α$, a proxy for the information lost during the interaction. Through bifurcation analysis of the mean-field model, we identify two critical thresholds, $α_{\text{crit}}^{(1)}$ and $α_{\text{crit}}^{(2)}$, which delimit stability regimes for the consensus states. These analytical predictions are validated through extensive agent-based simulations on both random and scale-free hypergraphs. Moreover, the analytical framework demonstrates that the bifurcation structure and critical thresholds are independent of the underlying topology of the higher-order network, depending solely on the parameters $d$, i.e., the size of the interaction groups, and the quality ratio. Finally, we bring to the fore a nontrivial effect: large interaction groups can drive the system toward adopting the worse option.
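The kind of agent-based simulation the abstract describes can be sketched in a few lines. The concrete update rule below (quality-weighted fraction mixed with a pooling-error term) is an illustrative assumption standing in for the paper's general mechanism, and groups are drawn uniformly at random, i.e. a random $d$-uniform hypergraph:

```python
import random

def simulate(n_agents=200, d=5, q_a=2.0, q_b=1.0, alpha=0.1,
             steps=3000, seed=0):
    """Toy opinion dynamics for two opinions A and B in groups of size d.

    At each step a random group of d agents is drawn; every member then
    re-adopts opinion A with probability
        (1 - alpha) * q_a * n_A / (q_a * n_A + q_b * n_B) + alpha / 2,
    where n_A, n_B count supporters inside the group and alpha plays the
    role of the pooling error (information lost during the interaction).
    Returns the final fraction of agents holding opinion A.
    """
    rng = random.Random(seed)
    opinions = [rng.random() < 0.5 for _ in range(n_agents)]  # True = A
    for _ in range(steps):
        group = rng.sample(range(n_agents), d)
        n_a = sum(opinions[i] for i in group)
        w_a, w_b = q_a * n_a, q_b * (d - n_a)
        p_a = (1 - alpha) * w_a / (w_a + w_b) + alpha / 2
        for i in group:
            opinions[i] = rng.random() < p_a
    return sum(opinions) / n_agents

frac_a = simulate()  # with q_a > q_b the higher-quality opinion dominates
```

With $Q_A > Q_B$ and small $α$ this rule drifts toward near-consensus on the higher-quality opinion; raising $α$ pushes the mixing term toward 1/2 and erodes that consensus, which is the qualitative regime change the critical thresholds delimit.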



7a674153c63cff1ad7f0e261c369ab2c-Supplemental.pdf

Neural Information Processing Systems

This is the appendix for "A mathematical model for automatic differentiation in machine learning". We study the backward mode of AD as implemented for nonsmooth functions by standard software; our theoretical results model AD as implemented in current machine-learning libraries. The appendix recalls the results from geometry used in the present work: the simplest o-minimal structure is given by the class of real semialgebraic objects (see, e.g., [21]). A central object is the set-valued map $D(x) = \{\operatorname{grad} f(x)\}$ (10), where $\operatorname{grad} f(x)$ is the gradient of $f$ restricted to the active stratum $M$, and the appendix gives equivalent conditions for $D$ to be conservative for $f$.
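For reference, the notion of "conservative for $f$" invoked above can be stated as follows; this is a paraphrase of Bolte and Pauwels' definition, so consult the paper for the precise hypotheses:

```latex
A set-valued map $D\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^n$ with
closed graph and nonempty values is \emph{conservative} for a locally
Lipschitz function $f\colon \mathbb{R}^n \to \mathbb{R}$ if, for every
absolutely continuous curve $\gamma\colon [0,1] \to \mathbb{R}^n$,
\[
  \frac{\mathrm{d}}{\mathrm{d}t}\, f(\gamma(t))
  \;=\; \langle v, \dot\gamma(t) \rangle
  \qquad \text{for all } v \in D(\gamma(t)),
  \text{ for almost every } t \in [0,1].
\]
```

Intuitively, any selection of $D$ integrates correctly along curves, which is the property that makes the output of nonsmooth backward AD a sound surrogate gradient.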




Kernel Two-Sample Testing via Directional Components Analysis

Cui, Rui, Li, Yuhao, Song, Xiaojun

arXiv.org Machine Learning

We propose a novel kernel-based two-sample test that leverages the spectral decomposition of the maximum mean discrepancy (MMD) statistic to identify and utilize well-estimated directional components in reproducing kernel Hilbert space (RKHS). Our approach is motivated by the observation that the estimation quality of these components varies significantly, with leading eigen-directions being more reliably estimated in finite samples. By focusing on these directions and aggregating information across multiple kernels, the proposed test achieves higher power and improved robustness, especially in high-dimensional and unbalanced sample settings. We further develop a computationally efficient multiplier bootstrap procedure for approximating critical values, which is theoretically justified and significantly faster than permutation-based alternatives. Extensive simulations and empirical studies on microarray datasets demonstrate that our method maintains the nominal Type I error rate and delivers superior power compared to other existing MMD-based tests.
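The MMD statistic at the core of this method can be sketched with NumPy. The code below shows only the basic unbiased MMD$^2$ U-statistic with a single Gaussian kernel, not the paper's directional-component decomposition, kernel aggregation, or multiplier bootstrap:

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2 * x @ y.T
    return np.exp(-sq / (2 * bandwidth**2))

def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of MMD^2 between samples x and y of shape (n, d)."""
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Drop diagonal terms so each within-sample sum is an unbiased U-statistic.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * kxy.mean()

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(200, 3)),
                     rng.normal(size=(200, 3)))          # near zero
diff = mmd2_unbiased(rng.normal(size=(200, 3)),
                     rng.normal(2.0, 1.0, size=(200, 3)))  # clearly positive
```

The spectral view the paper builds on comes from eigendecomposing the centered kernel matrices; the directional-component idea is to trust only the leading, well-estimated eigen-directions of that decomposition when forming the test statistic.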